Introduction to HOBIT, a b-Jet Identification Tagger at
the CDF Experiment Optimized for Light Higgs Boson
Searches

J. Freeman a, T. Junk a, M. Kirby a, Y. Oksuzian b, T.J. Phillips c,
F.D. Snider a, M. Trovato e, J. Vizan f, W.M. Yao d

a Fermi National Accelerator Laboratory, Batavia, IL, 60510, USA
b University of Virginia, Charlottesville, Virginia, 22906, USA
c Duke University, Durham, North Carolina, 27708, USA
d Ernest Orlando Lawrence Berkeley National Laboratory,
Berkeley, California, 94720, USA
e Istituto Nazionale di Fisica Nucleare Pisa, Scuola Normale Superiore, I-56127 Pisa, Italy
f Université catholique de Louvain, Louvain-la-Neuve, B-1348, Belgium



Abstract 

We present the development and validation of the Higgs Optimized b
Identification Tagger (HOBIT), a multivariate b-jet identification algorithm
optimized for Higgs boson searches at the CDF experiment at the Fermilab
Tevatron. At collider experiments, b taggers allow one to distinguish particle
jets containing B hadrons from other jets; these algorithms have been used
for many years with great success at CDF. HOBIT has been designed
specifically for use in searches for light Higgs bosons decaying via
$H \to b\bar{b}$. This fact, combined with the extent to which HOBIT
synthesizes and extends the best ideas of previous taggers, makes HOBIT unique
among CDF b-tagging algorithms. Employing feed-forward neural network
architectures, HOBIT provides an output value ranging from approximately -1
("light-jet like") to 1 ("b-jet like"); this continuous output value has been
tuned to provide maximum sensitivity in light Higgs boson search analyses.
When tuned to the same light-jet rejection rate, HOBIT tags 54% of b jets in
simulated 120 GeV/c^2 Higgs boson events, compared to 39% for SecVtx, the most
commonly used b tagger at CDF. We present features of the tagger as well as its
characterization in the form of b-jet finding efficiencies and false
(light-jet) tag rates.

Keywords: b-jet identification, b-tagging, standard model Higgs boson,
CDF, Tevatron



1. Introduction

At CDF, the search for a light Higgs boson has been a subject of increasing
interest and focus in recent years. While there have been numerous successful
b-jet identification algorithms (commonly referred to as "b taggers") over the
years, most have been intended for use in analyses other than searches for
$H \to b\bar{b}$. Aspects of a given analysis, however, such as the optimal
signal-to-background ratio, or the relative rate of non-b jets originating from
gluons in the data sample before tagging, can influence whether a tagger is
optimal for the analysis in question. Traditional taggers have tended toward a
higher purity and lower efficiency than would be ideal for Higgs boson
searches, given the relatively low cross section of Higgs boson production at
Tevatron energies. While this problem has been circumvented somewhat by taking
the logical OR of several taggers, a more elegant and flexible solution can be
found in the continuous output of a neural network, tunable for each analysis
application.

In this paper, we describe the Higgs Optimized b Identification Tagger
(HOBIT). The strategy used in developing HOBIT is to build upon the
strengths of previous CDF b taggers, address their weaknesses, and construct
a new tagger that is highly optimized specifically for finding light Higgs
boson decays. HOBIT produces a continuous output variable, allowing efficiency
and background rejection to be tuned to meet the requirements of a given
search. In the next section, we review some of the general features of b
quark decays used by HOBIT to distinguish jets containing B hadrons from
jets produced by gluons or light quarks (up, down, or strange). Section 3
then describes some of the previous b-tagging algorithms used by CDF upon
which HOBIT is built. We then discuss some features of the CDF detector
in Sec. 4, followed by a detailed description of the HOBIT algorithm and
training regimen. The performance of HOBIT as characterized by the b-jet
tagging efficiency and background rejection rates in data and Monte Carlo
(MC) is presented in Sec. 6. We conclude in Sec. 7.

2. Physics of b's from Higgs Boson Decay

Jets containing high-p_T B hadrons, such as are created in a light Higgs
boson decay, possess several features that distinguish them from jets produced
by light quarks or gluons. The most important of these is the relatively long
lifetime of a B hadron, augmented in the lab frame by its relativistic boost,
which allows it to travel a distance on the order of a millimeter (see
footnote 1). The B hadron's travel across these macroscopic distances results
in a displacement between the location of the $p\bar{p}$ collision (the
"primary" vertex) and the B hadron decay (the "secondary", or "displaced"
vertex). These displacements are resolvable by the CDF tracking system, and in
particular by its silicon detector. Almost all information as to whether or
not a given jet originates from b-quark production is carried in the tracks
reconstructed from detector signals left by the jet's charged particles.
Specifically, it is possible to identify the decay of a B hadron through the
displacement from the primary vertex of the individual tracks it leaves in the
detector, and also through the displacement of a B-hadron decay vertex formed
by combining multiple displaced tracks in a fit.
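As a rough illustration of why the flight distance is of order a millimeter, the mean lab-frame decay length can be estimated as L = (p/m) c tau. The sketch below is not part of the tagger; the B0 mass of 5.28 GeV/c^2 and the 40 GeV/c momentum are assumed values chosen for illustration, while c tau = 460 um is the B0 value quoted in footnote 1.

```python
def mean_decay_length_mm(p_gev, mass_gev, ctau_um):
    """Mean lab-frame flight distance L = (p / m) * c*tau, in millimeters.

    p / m equals the relativistic boost factor beta*gamma (natural units).
    """
    beta_gamma = p_gev / mass_gev
    return beta_gamma * ctau_um / 1000.0  # convert micrometers to millimeters


# Assumed, illustrative inputs: a B0 with 40 GeV/c of momentum.
example_length_mm = mean_decay_length_mm(p_gev=40.0, mass_gev=5.28, ctau_um=460.0)
```

With these assumed inputs the estimate comes out at a few millimeters, consistent with the order-of-magnitude statement above.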

Other features also distinguish the b jet from other jets. Due to the large
mass of the b quark, the collective invariant mass of the decay products of
B hadrons is larger than that of the decay products of hadrons not
containing b quarks. Furthermore, the large relativistic boost typical of a B
hadron results in decay products which tend to be more energetic and
collimated within a jet cone than other particles. Finally, particle
multiplicities tend to be different for jets containing B hadron decays
compared to other jets; in particular, muons or electrons appear in
approximately 20% of jets containing a B hadron, either directly via
semileptonic decay of the B or indirectly through the semileptonic decay of
charm hadrons resulting from a B decay.

3. b-Tagging Algorithms

As a tremendous amount of effort has gone into the construction of b
taggers at CDF and other experiments [1, 2, 3], we build upon previous
experience when constructing HOBIT. In particular, HOBIT explicitly uses
as inputs the output of the SecVtx algorithm set to its "loose" operating
point [4], the output of CDF's soft muon tagger [5], and inputs to the earlier
RomaNN [6, 7] and Bness [8] multivariate taggers. Consequently, it is useful
to describe these taggers.

Footnote 1: This distance is achieved due to the fact that c times the rest
frame lifetime of a B0 (B±, Bs, Λb) hadron is 460 μm (501 μm, 441 μm, 367 μm).

3.1. SecVtx

SecVtx is a displaced vertex tagger and the most commonly used b tagger
at CDF. SecVtx only uses tracks which are significantly displaced from the
primary vertex, accepted by quality requirements, and within a distance of
ΔR < 0.4 of the jet axis. Here, $\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2}$,
where $\phi$ is the azimuthal angle of the track around the beam axis, and
$\eta$ is its pseudorapidity, defined as $\eta = -\ln(\tan(\theta/2))$, with
$\theta$ the polar angle of the track with respect to the beam axis. With
these tracks, SecVtx uses an iterative method to fit a displaced vertex within
the jet, where the χ² of the vertex fit is employed to guide the process.
Assuming that this displacement is due to the long lifetime of the B hadron,
the significance of the two-dimensional decay length L_xy in the plane
perpendicular to the beampipe axis is used to select b-jet candidates. The
algorithm is utilized with different track requirements and threshold values
in order to achieve different efficiencies and purity rates. In practice,
three operating points are used, referred to as "loose", "tight", and
"ultra-tight". The loose SecVtx operating point decision is used as an input
to both the RomaNN and HOBIT taggers. One drawback of the SecVtx tagger is
that it is unable to fit a vertex in every b jet. In the Pythia [9]
120 GeV/c^2 Higgs boson Monte Carlo (MC) whose b jets are used to train
HOBIT, SecVtx operating at its "loose" setting fails to find a vertex in 44.3%
of these jets.
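The cone requirement used here and throughout the paper follows directly from the definitions of pseudorapidity and ΔR given above. The following is a minimal sketch of that geometry only, not of SecVtx itself; the function names and the representation of jets and tracks as (eta, phi) pairs are assumptions made for illustration.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta / 2)), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(dphi^2 + deta^2), with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    deta = eta1 - eta2
    return math.hypot(deta, dphi)


def tracks_in_cone(jet, tracks, max_dr=0.4):
    """Keep the (eta, phi) tracks lying within max_dr of the jet axis."""
    jet_eta, jet_phi = jet
    return [t for t in tracks if delta_r(jet_eta, jet_phi, t[0], t[1]) < max_dr]
```

Wrapping the azimuthal difference matters for jets near phi = ±pi, where a naive subtraction would place nearby tracks almost 2*pi apart.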

3.2. Soft Lepton Taggers

Soft lepton taggers [5] (SLT) take a different approach to b tagging.
Rather than focusing on tracks within a jet, they select B hadron decays
by identifying charged leptons inside a cone around the jet axis. Since the
b semileptonic branching ratio is approximately 10% per lepton flavor, this
class of tagger is not competitive with SecVtx or the other taggers described
below if used alone. However, because a soft lepton tagger does not rely
on the presence of displaced tracks or vertices, it has a chance to identify
b jets that the other methods cannot. In practice, CDF uses only a soft
muon tagger, since high-purity electron or τ identification within jets is
difficult. HOBIT uses as inputs the number of soft muon tags within a jet
as well as the momentum transverse to the jet axis of the muon with the
highest-likelihood tag.

3.3. The RomaNN Tagger

The "RomaNN tagger" has been used at CDF in light Higgs boson
searches [6, 7] and employs neural network architectures. Neural networks
(NNs) can use as many flavor-discriminating observables as is computationally
feasible; hence the efficiency of NN taggers is equal to or greater than
that of conventional taggers for a given purity. While the SecVtx tagger
attempts to find exactly one displaced vertex in a jet, the RomaNN tagger
uses a vertexing algorithm that can find multiple vertices, as may be the
case when multiple hadrons decay within the same jet cone (for example, in
a B → D decay). The RomaNN tagger uses several types of NNs: one to
distinguish vertices which come from a heavy flavor (B or charm) hadron
from false vertices or vertices coming from other hadrons; another to identify
unvertexed tracks which come from a heavy flavor hadron; and then another
NN which takes as inputs the output of the first NNs along with other inputs,
including the loose SecVtx tag status, the number of SLT-identified muons,
and the vertex displacement and mass information. Distinct versions of this
third NN are trained to separate b jets from light jets, charm jets from light
jets, and b jets from charm jets; the outputs of these three flavor-separating
NNs are then used to train a final NN whose output is the RomaNN
discrimination variable. The RomaNN tagger not only has superior performance
to that of SecVtx at equivalent purities (see Fig. 5), but also allows for an
"ultra-loose" operating point yielding greater efficiency, particularly useful
in light Higgs boson searches.

However, the RomaNN tagger is not guaranteed to fit a vertex or to
have sufficient input information to reliably tag a jet. In the event that the
RomaNN tagger fails to receive sufficient information from its inputs, it is
unable to assign an output value to that jet. This is the case with 20.6% of
the b jets in the aforementioned light Higgs boson MC sample. Regardless,
due to the usefulness of the RomaNN inputs, a majority of them are employed
as inputs into the HOBIT tagger, which allows HOBIT to take advantage of
the same extensive vertex information that the RomaNN tagger uses.

3.4. The Bness Tagger

The RomaNN tagger focuses on the vertices it finds within a jet; in
the event that it is unable to fit any vertices, it is unable to distinguish
b jets from light jets. However, a significant proportion of b jets
(approximately 20% in Higgs boson candidate events) do not contain a
sufficient number of well-reconstructed tracks to allow for a vertex fit in
the RomaNN tagger. The Bness tagger [8] uses not only vertex information
within a jet, but also the properties of individual tracks to determine
whether a jet is b-like. (The RomaNN tagger only examines individual tracks
based on their proximity to a displaced vertex.) To evaluate the information
from individual tracks, the Bness tagger utilizes an NN which is applied to
all tracks passing loose requirements, and which takes positional (e.g.,
impact parameter) and kinematic (e.g., p_T) information on a track to
determine whether it appears to have come from the decay of a B hadron. The
Bness tagger is therefore able to extract information from all but a few
percent of b jets, and can achieve a very high efficiency for a reasonable
level of purity. This robust property of the tagger makes it useful for
analyses where efficiency is critical, as is the case with light Higgs boson
analyses or even searches for hadronic decays of heavy gauge bosons (see
Ref. [10] for more details). A track-by-track NN very similar to that employed
by the Bness tagger is used to evaluate tracks in HOBIT; this will be
described in Section 5. One drawback of the Bness tagger is that, like SecVtx
and unlike RomaNN, it is only able to fit one vertex per jet. Additionally, it
uses fewer vertex-based inputs than the RomaNN tagger, and therefore only its
track-by-track algorithm is used in HOBIT.

4. The CDF Detector

The CDF II detector is described in detail elsewhere [11]. The detector is
cylindrically symmetric around the proton beam line (see footnote 2), with
tracking systems that sit within a superconducting solenoid which produces a
1.4 T magnetic field aligned coaxially with the $p\bar{p}$ beams. A set of
calorimeters and muon detectors, to be described later, surround the tracking
systems and solenoid.

The outermost tracking system, the Central Outer Tracker (COT), is a
3.1 m long open cell drift chamber which performs up to 96 track position
measurements in the region between 0.40 and 1.37 m from the beam axis,
providing coverage in the pseudorapidity region |η| < 1.0 [12]. Sense wires
are arranged in eight alternating axial and ±2° stereo "superlayers" with 12
wires each. The position resolution of a single drift time measurement is
about 140 μm.

Footnote 2: The proton beam direction is defined as the positive z direction.
The rectangular coordinates x and y point radially outward and vertically
upward from the Tevatron ring, respectively. Transverse energy and transverse
momentum are defined as E_T = E sin θ and p_T = p sin θ, respectively, θ
having been defined in Sec. 3.

Charged-particle trajectories are found first as a series of approximate line
segments in the individual axial superlayers. Two complementary algorithms
associate segments lying on a common circle, and the results are merged
to form a final set of axial tracks. Track segments in stereo superlayers
are associated with the axial track segments to reconstruct tracks in three
dimensions.

A five-layer double-sided silicon microstrip detector (SVX) covers the
region from 2.5 to 11 cm from the beam axis. Three separate SVX barrel
modules along the beam line together cover a length of 96 cm, approximately
90% of the luminous beam interaction region. Three of the five layers combine
an r-φ measurement on one side and a 90° stereo measurement on the other,
and the remaining two layers combine an r-φ measurement with small angle
stereo at ±1.2°. The typical silicon hit resolution is 11 μm. Additional
Intermediate Silicon Layers (ISL) at radii between 19 and 30 cm from the
beam line in the central region link tracks in the COT to hits in the SVX.

Silicon hit information is added to COT tracks using a progressive
"outside-in" tracking algorithm in which COT tracks are extrapolated into the
silicon detector, associated silicon hits are found, and the track is refit
with the added information of the silicon measurements. The initial track
parameters provide a width for a search road in a given layer. Then, for each
candidate hit in that layer, the track is refit and used to define the search
road into the next layer. This stepwise addition of precision SVX information
at each layer progressively reduces the size of the search road, while also
accounting for the additional uncertainty due to multiple scattering in each
layer. The search uses all candidate hits in each layer to generate a small
tree of final track candidates, from which the tracks with the best χ² are
selected. The efficiency for associating at least three silicon hits with an
isolated COT track is 91 ± 1%. The extrapolated impact parameter resolution
for high-momentum outside-in tracks is much smaller than for COT-only tracks:
40 μm, dominated by a 30 μm uncertainty in the beam position.

Outside the tracking systems and the solenoid, segmented calorimeters
with projective geometry are used to reconstruct electromagnetic (EM) showers
and jets. The EM and hadronic calorimeters are lead-scintillator and
iron-scintillator sampling devices, respectively. The central and plug
calorimeters are segmented into towers, each covering a small range of
pseudorapidity and azimuth, and in full cover the entire 2π in azimuth and the
pseudorapidity regions of |η| < 1.1 and 1.1 < |η| < 3.6, respectively. The
transverse energy, E_T, where the polar angle is calculated using the measured
z position of the event vertex, is measured in each calorimeter tower.
Proportional chambers and scintillation detectors arranged in strips measure
the transverse profile of EM showers at a depth corresponding to the shower
maximum.

High-momentum jets, photons, and electrons leave isolated energy deposits
in contiguous groups of calorimeter towers which can be summed together into
an energy "cluster". Electrons are identified in the central EM calorimeter
as isolated, mostly electromagnetic clusters that also match with a track in
the pseudorapidity range |η| < 1.1. The electron transverse energy is
reconstructed from the measured energy in the electromagnetic cluster with
precision $\sigma(E_T)/E_T = 13.5\%/\sqrt{E_T\,(\mathrm{GeV})} \oplus 2\%$,
where the $\oplus$ symbol denotes addition in quadrature. Jets are identified
as a group of electromagnetic and hadronic calorimeter clusters using the
JETCLU algorithm [13] with a cone size of ΔR = 0.4. Jet energies are corrected
for calorimeter non-linearity, losses in the gaps between towers, multiple
primary interactions, the underlying event, and out-of-cone losses [14]. The
jet energy resolution is approximately
$\sigma_{E_T} = 1.0\ \mathrm{GeV} + 0.1 \times E_T$.
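To make the two resolution formulas concrete, the short sketch below evaluates them numerically. It only restates the expressions quoted above: the electron term adds a stochastic 13.5%/sqrt(E_T) contribution and a constant 2% contribution in quadrature, while the jet term is a linear approximation. The function names are chosen here for illustration.

```python
import math


def electron_et_fractional_resolution(et_gev):
    """sigma(E_T)/E_T = 13.5%/sqrt(E_T [GeV]) added in quadrature with 2%."""
    stochastic = 0.135 / math.sqrt(et_gev)
    constant = 0.02
    return math.hypot(stochastic, constant)  # quadrature sum


def jet_et_resolution_gev(et_gev):
    """Approximate jet resolution sigma_ET = 1.0 GeV + 0.1 * E_T, in GeV."""
    return 1.0 + 0.1 * et_gev
```

For example, at E_T = 25 GeV the stochastic term is 2.7%, so the combined electron resolution is sqrt(0.027^2 + 0.02^2), about 3.4%; a 50 GeV jet has a resolution of about 6 GeV.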

Directly outside of the calorimeter, four-layer stacks of planar drift
chambers detect muons with p_T > 1.4 GeV/c that traverse the five absorption
lengths of the calorimeter. Farther out, behind an additional 60 cm of steel,
four layers of drift chambers detect muons with p_T > 2.0 GeV/c. The two
systems both cover the region |η| < 0.6, though they have different
structures, and therefore there are places where their geometrical coverage
does not overlap. Muons in the region 0.6 < |η| < 1.0 pass through at least
four drift layers arranged in a conic section outside of the central
calorimeter. Muons are identified as isolated tracks in the COT that
extrapolate to track segments in one of the four-layer stacks.

5. The HOBIT Tagger

The HOBIT tagger is similar to other multivariate b-tagging algorithms
previously used at CDF, such as the RomaNN and Bness taggers. All of
these taggers attempt to make maximal use of the available information in
b jets, and construct a continuous discriminating variable. HOBIT improves
upon these earlier taggers, however, by addressing specific weaknesses of each
and optimizing for light Higgs boson searches.

5.1. The architecture

HOBIT is constructed as a feed-forward multilayer perceptron neural network
implemented using the TMVA package for ROOT [15]. It has 25 inputs, two
hidden layers of 25 and 26 nodes, and a hyperbolic tangent activation
function. Five hundred cycles were used in the training. The training regimen
used b jets in Pythia [9] 120 GeV/c^2 Higgs boson Monte Carlo (MC) and light
jets from Alpgen-generated Pythia W+jets MC. Charm jets were not considered
during training due to preliminary studies which indicated a relative
insensitivity of light Higgs boson searches to charm jet contamination. Here,
"b jet" denotes a jet with a B hadron within a cone of ΔR < 0.4 of the jet
axis, while a "charm jet" contains a charm hadron but no B hadrons within this
cone, and a "light jet" contains neither B hadrons nor charm hadrons within
this cone. Jets were required to have E_T > 15 GeV, |η| < 2, and at least one
track for use in the track-by-track NN described in Sec. 5.3.
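The layer structure described above (25 inputs, hidden layers of 25 and 26 nodes, hyperbolic tangent activation) can be sketched as a plain forward pass. This is not the TMVA implementation: the weights are random placeholders rather than trained values, the biases are zero, and treating the single output node as linear is an assumption made here for illustration.

```python
import math
import random


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j][i] links input i to node j."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def make_layer(n_in, n_out, rng):
    """Placeholder (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(inputs, layers):
    """tanh on hidden layers; the output node is left linear (assumption)."""
    x = inputs
    for index, (weights, biases) in enumerate(layers):
        is_output = index == len(layers) - 1
        x = dense(x, weights, biases, None if is_output else math.tanh)
    return x[0]


rng = random.Random(0)
sizes = [25, 25, 26, 1]  # inputs, hidden layer 1, hidden layer 2, output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
score = forward([0.0] * 25, layers)
```

In a trained network the weights would come from the 500 training cycles on the b-jet and light-jet samples, and the score would be compared against an operating-point threshold.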

The 25 inputs to the tagger are a combination of RomaNN and Bness
inputs, albeit with some exceptions, additions and modifications. Fourteen
of these inputs are also inputs to the RomaNN tagger. A further ten inputs
to HOBIT are the ten highest track-by-track NN discriminant output values
of tracks in the jet cone. In the event that there are fewer than ten tracks
in a jet, the values of the remaining track-by-track NN inputs are set to -1,
as this is the light-jet-like value of the NN output. The number of tracks
which pass the track-by-track NN selection criteria is found to have additional
discriminating power and is also used as an input to HOBIT. Track selections
differ between tracks used for RomaNN inputs and tracks evaluated with
the track-by-track NN. Tracks used for RomaNN inputs must have p_T > 1
GeV/c and be within ΔR < 0.4 of the jet axis (the same selection used
in the published RomaNN tagger), while tracks used by the track-by-track
NN have a looser requirement of p_T > 0.5 GeV/c and a distance of
ΔR < 0.7 from the jet axis (the original requirement was ΔR < 0.4). Other
selection cuts were considered, but none resulted in an improvement in the
performance of HOBIT. Note that one of the RomaNN inputs used (also used
in the Bness tagger) is the E_T of the jet itself. The various HOBIT inputs
are correlated with E_T, so the E_T provides additional useful information to
HOBIT. We prevent kinematic biasing of HOBIT by weighting the light jet
training sample to have the same E_T distribution as the b-jet training sample.
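The rule for filling the ten track-by-track NN input slots, together with the track-count input, can be written compactly. The sketch below assumes the per-track NN outputs for a jet are already available as a list of floats; the function name and its return layout are hypothetical.

```python
def track_nn_inputs(track_scores, n_slots=10, pad_value=-1.0):
    """Return (ten highest track NN outputs padded with -1, number of tracks).

    track_scores holds the track-by-track NN output of every selected track
    in the jet; -1 is the light-jet-like value used for empty slots.
    """
    ranked = sorted(track_scores, reverse=True)
    top = ranked[:n_slots]
    padded = top + [pad_value] * (n_slots - len(top))
    return padded, len(track_scores)
```

For a jet with three selected tracks scoring 0.2, 0.9 and -0.5, this yields the slots 0.9, 0.2, -0.5 followed by seven entries of -1, and a track count of 3.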

As previously mentioned, one potential weakness of the RomaNN tagger
is its inability to produce a usable output when there is insufficient input
information. This requirement of "RomaNN taggability" can be a liability
when very high b-jet tagging efficiency is sought. In the MC sample used to
train the HOBIT tagger, 21% of b jets fail to be RomaNN taggable, versus
30% of light jets. The track-by-track NN in HOBIT compensates for this
shortfall of RomaNN. While jets in HOBIT are required to have at least one
track with an evaluated track-by-track NN output, only 3.0% of b jets and
2.1% of light jets in the MC fail this requirement, indicating a very efficient
taggability requirement.

The full list of inputs to HOBIT, ranked by importance after TMVA's
training, is provided in Table 1. Here, "importance" refers to the sum of
the squares of the weights connecting a given input to the nodes of the first
hidden layer of HOBIT. Distributions of the inputs to HOBIT are shown in
Fig. 1. A description of these inputs is given below.
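The importance measure defined above is simple to compute once the first-layer weights are known. The sketch below assumes those weights are available as a nested list indexed as weights[hidden_node][input]; this layout and the function names are illustrative choices, not TMVA's interface.

```python
def input_importance(first_layer_weights):
    """Sum of squared weights from each input to all first-hidden-layer nodes.

    first_layer_weights[j][i] is the weight linking input i to hidden node j.
    """
    n_inputs = len(first_layer_weights[0])
    return [sum(row[i] ** 2 for row in first_layer_weights) for i in range(n_inputs)]


def rank_inputs(names, first_layer_weights):
    """Pair each input name with its importance, most important first."""
    scores = input_importance(first_layer_weights)
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
```

An input whose weights into the first hidden layer are all small contributes little to the network response and therefore ranks low.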



5.2. The RomaNN inputs

RomaNN inputs used in HOBIT consist of observables built using tracks
and vertices found to be "heavy-flavor-like" (HF-like) according to its NNs.
No modifications were made to the RomaNN inputs compared to the published
tagger. These inputs include:






• The invariant mass, pseudo-cτ, 3-d displacement and 3-d displacement
  significance of the most HF-like vertex.

• The number of tracks both in HF-like vertices and standalone HF-like
  tracks associated to a displaced vertex, as well as their combined
  invariant mass, and the ratio of the scalar sum of the p_T's of these
  tracks to the scalar sum of the p_T's of all tracks in the jet.

• The loose SecVtx tag status, as well as the mass of the tracks used in
  the loose SecVtx vertex fit.

5.3. Bness inputs: the track-by-track NN 

As mentioned above, the ten highest evaluated track-by-track NN outputs
for tracks in a jet serve as inputs to HOBIT. Therefore, this section concerns
the track-by-track NN itself. The input variables to the track-by-track NN
are the same for HOBIT as were used in the track-by-track NN of the original
Bness tagger. However, the track-by-track Bness NN was retrained to
create the HOBIT track-by-track NN. This was done not only because the
cone requirement on the tracks was loosened but also because we wished to
optimize the track-by-track NN for light Higgs boson searches. Hence, while
the original Bness track-by-track NN was trained using ZZ → 4 jets MC,
the HOBIT track-by-track NN was trained using the same MC as was used
to train the overall HOBIT tagger. Since the track-by-track NN operates
at the level of individual tracks, we impose an additional requirement on
b-jet tracks for the purposes of training by demanding that they be within
ΔR < 0.05 of the actual charged particles resulting from a B hadron decay
in the MC. The track-by-track NN employed the same basic framework for
training as that used for HOBIT itself (training cycles, inner layer structure,
etc.).

Some of the inputs to the track-by-track NN take advantage of the fact
that tracks from B hadron decays are displaced from the primary vertex.
These inputs include the impact parameter, the distance along the z-axis
between the track and the primary vertex, and the significance of each.
Kinematic inputs such as the p_T, rapidity, and track momentum perpendicular
to the jet axis (p_perp) exploit the greater collimation of B tracks due to the
large boost of the hadron. Finally, the jet E_T is an input to the
track-by-track NN, because the previously mentioned inputs are correlated with
jet E_T. Tracks from light jets are weighted in training such that the jets
which contain them have the same E_T distribution as the b jets; this is done
so as to avoid kinematic biasing in the track-by-track NN. Distributions of the
track-by-track NN inputs are shown in Fig. 2. Not shown are the jet E_T
distributions, which are identical by construction.
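One common way to realize the E_T weighting described above is a binned histogram ratio; the paper does not spell out the procedure, so the sketch below is an assumption, and the 10 GeV bin width is a hypothetical choice. Each light jet receives the ratio of the b-jet to light-jet fraction in its E_T bin, and every track in that jet would inherit the jet's weight.

```python
from collections import Counter


def et_bin(et_gev, bin_width=10.0):
    """Index of the E_T bin containing et_gev."""
    return int(et_gev // bin_width)


def light_jet_weights(b_jet_ets, light_jet_ets, bin_width=10.0):
    """Per-light-jet weights that reshape the light-jet E_T spectrum to the
    b-jet one, using the ratio of normalized bin contents."""
    b_counts = Counter(et_bin(et, bin_width) for et in b_jet_ets)
    light_counts = Counter(et_bin(et, bin_width) for et in light_jet_ets)
    n_b, n_light = len(b_jet_ets), len(light_jet_ets)
    weights = []
    for et in light_jet_ets:
        k = et_bin(et, bin_width)
        b_fraction = b_counts[k] / n_b
        light_fraction = light_counts[k] / n_light  # nonzero: this jet is in bin k
        weights.append(b_fraction / light_fraction)
    return weights
```

After weighting, the summed light-jet weight in each bin is proportional to the b-jet count in that bin, which is what makes the two E_T distributions identical by construction.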

5.4. HOBIT Performance

The HOBIT output distributions for b jets and light jets, taken from an MC
sample independent of, but generated identically to, the one used to train the
discriminator, are shown in Fig. 3. In Fig. 4, the b-jet efficiencies and the
light-jet efficiencies ("mistag rates") as a function of jet E_T and η are
shown for two HOBIT operating points: a requirement of a HOBIT output > 0.72
("loose") and a requirement of a HOBIT output > 0.98 ("tight"). At higher
η, where tracking coverage is more sparse and less information is available,
the b-tagging efficiency drops, as would be expected. Interestingly, the mistag
rate increases in the case of the loose tag and drops in the case of the tight
tag, demonstrating the higher impact of incorrectly identified tracks when
using a loose tagging requirement. In general, the efficiency increases with
increasing jet E_T due to the greater displacement of the B hadron. Similarly,
the light jet efficiency increases, at least in part due to the higher rapidity
and p_T of tracks in high-E_T jets.

Figure 1: Inputs to HOBIT. The solid histogram is for light quark jets and the dashed
(colored) histogram is for b jets. Taken from MC, the distributions are normalized to one
another. Left to right, top to bottom: the Bness value for the 10 highest Bness tracks;
the number of Bness-selected tracks; the loose SecVtx tag status and the mass of its fitted
vertex; the number of SLT-tagged muons and the momentum transverse to the jet axis of
the most SLT-favored muon; jet E_T; the 3-d displacement significance of the most HF-like
vertex in RomaNN; the invariant mass, number, and fraction of total track p_T of HF-like
tracks; the 3-d displacement, pseudo-cτ and invariant mass of the most HF-like vertex;
the number of RomaNN-selected tracks and their total p_T.

Figure 2: Inputs to track-by-track NN. The solid histogram is for tracks in light quark
jets and the dashed (colored) histogram is for tracks in b jets; taken from MC, the
distributions are normalized to one another. Not shown is the jet E_T, identical between the
two distributions by construction. Left to right, top to bottom: significance of the impact
parameter and Δz between the track and the primary vertex; the values of the impact
parameter and Δz; the p_T of the track with respect to the beam axis; and the track's
rapidity and p_T with respect to the jet axis.

Figure 3: HOBIT outputs. The output is trained so that 1 is b jet-like and -1 is targeted
to be light jet-like. The black histogram is for light quark jets and the colored histogram
is for b jets. Taken from MC, the distributions are normalized to one another.



Figure 4: The b-jet and light-jet efficiencies in MC before SF corrections as a function of
$\eta$ and $E_T$. The black triangles are for the looser operating point and the colored triangles
are for the tighter operating point.

The performance of a tagger is best evaluated by comparing its purity to its
tagging efficiency at given operating points. We compare HOBIT's purity
versus efficiency curve to the curves of the Bness and RomaNN taggers and
to the purity versus efficiency performance of SecVtx at both its tight and
loose operating points (Fig. 5). Here, purity refers to the fraction of light jets
in W+jets MC which are not tagged as b jets, and efficiency refers to the
fraction of b jets in light Higgs boson MC which are tagged. When evaluating
tag efficiencies, the jets in both the numerator and denominator are required
to have $E_T > 15$ GeV and $|\eta| < 2$, the same $E_T$ and $\eta$ requirements as
were placed on the jets in the training of HOBIT. Fig. 5 shows that for a
given purity level, the improvement in absolute efficiency due to HOBIT is
approximately 10% over the Bness and RomaNN taggers, and approximately
15% over the SecVtx tagger.
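
As an illustration of the purity and efficiency definitions used above, the following Python
sketch evaluates both quantities for a chosen minimum tagger output. The function names, the
toy output values, and the convention that a jet counts as tagged when its output exceeds the
cut are assumptions made for this example; they are not taken from the HOBIT software.

```python
def efficiency_and_purity(b_scores, light_scores, cut):
    """Return (b-jet efficiency, light-jet purity) for a minimum tagger output."""
    if not b_scores or not light_scores:
        raise ValueError("both jet samples must be non-empty")
    # Efficiency: fraction of b jets whose output passes the cut.
    efficiency = sum(s > cut for s in b_scores) / len(b_scores)
    # Purity, as defined in the text: fraction of light jets that are NOT tagged.
    purity = sum(s <= cut for s in light_scores) / len(light_scores)
    return efficiency, purity


def purity_efficiency_curve(b_scores, light_scores, cuts):
    """Evaluate (cut, efficiency, purity) at each requested operating point."""
    return [(cut, *efficiency_and_purity(b_scores, light_scores, cut)) for cut in cuts]


# Made-up tagger outputs in [-1, 1]; real inputs would be jets passing the
# E_T > 15 GeV and |eta| < 2 requirements in the relevant MC samples.
toy_b = [0.99, 0.95, 0.80, 0.40, -0.20]
toy_light = [-0.95, -0.90, -0.50, 0.10, 0.85]
curve = purity_efficiency_curve(toy_b, toy_light, cuts=[0.72, 0.98])
```

Scanning many cut values in this way traces out a purity versus efficiency curve of the kind
compared in Fig. 5.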

We investigated how much of the improvement in HOBIT over earlier
taggers is due to the optimization on jets that specifically originated from
Higgs boson decays. To study this, we trained NN taggers that take the same
inputs as Bness and RomaNN using W+jets and light Higgs boson MC, then
compared the purity versus efficiency curve with those of the original Bness
and RomaNN taggers, which were trained using ZZ MC and Z+jets MC,
respectively. The results can be seen in Figs. 6 and 7. In the case of the
RomaNN comparison, not only is our retrained RomaNN tagger compared
with the original RomaNN result, but also with RomaNN's b versus light jet
separator. This is because the architecture of RomaNN consisted of three
different NN separators (b versus light, b versus charm, light versus charm)
which fed into the final RomaNN separator. As we retrained using light and
b jets, the comparison of the Higgs-optimized version of the RomaNN tagger
with the original b versus light separator makes for a more fair comparison.
In both the Bness and RomaNN cases, the improvement in absolute efficiency
is approximately 2%.

6. Efficiency and Mistag Scale Factors

In order to be used in a physics analysis, the performance of the HOBIT b
tagger must be calibrated. Historically, MC modeling of b-tag efficiencies and
mistag rates has not been sufficient to use the uncorrected predictions of the
MC. Instead, we use various techniques to measure the b-tagging efficiency
and the mistag rate using CDF data. Examples of such techniques applied
to the SecVtx algorithm are using jets containing electrons (therefore
HF-enriched) for measuring the b-tagging efficiency [16], and using the rate at
which jets have a displaced vertex reconstructed behind the primary vertex
("negative tags") to estimate mistags [17]. For the tight SecVtx tagger,
the b-tag efficiency is found to be well predicted by the MC up to a scale
factor (SF), where SF = 0.96 ± 0.05 for the full CDF dataset. In order to
utilize HOBIT to predict yields in data from MC simulation, a similar level
of uncertainty in HOBIT's SF to that of SecVtx's SF is needed for each
operating point.

Figure 5: A comparison of the purity-efficiency tradeoffs for HOBIT versus RomaNN,
Bness, and SecVtx loose and tight. A significant improvement over prior multivariate
taggers is seen.

Figure 6: A comparison of the purity-efficiency tradeoffs for the original RomaNN tagger
(as well as its b-light separator) and our version of the Higgs-optimized RomaNN tagger.

Figure 7: A comparison of the purity-efficiency tradeoffs for the original Bness tagger and
our version of the Higgs-optimized Bness tagger.

An important difference between SecVtx and HOBIT is the absence of
negative tags in HOBIT, meaning the SecVtx mistag calculation technique
cannot be applied. Instead, we use two new techniques described below
for calibrating b-tag SFs and providing mistag rates: the "$t\bar{t}$ cross section
method", and the "electron conversion method".

6.1. Scale factors using the $t\bar{t}$ cross section method

The $t\bar{t}$ cross section method seeks to calibrate the predicted b-tagging
efficiency and the mistag rate in MC to match those measured in data using
$t\bar{t}$ candidate events in a W+3-or-more-jets sample under the assumption
that the $t\bar{t}$ cross section is known. The method is based upon a previous
analysis [18] that simultaneously measured the SecVtx b-tag SFs and the $t\bar{t}$
cross section. In that measurement, the rates of singly and doubly tagged
events provide a constraint which allows the measurement of two unknowns.
A two-dimensional fit was performed to maximize the likelihood of observing
the data counts as functions of the SecVtx b-tag SF and the $t\bar{t}$ cross section.
This method has been repurposed such that the $t\bar{t}$ cross section is now an
input assumption, allowing for the calibration of the HOBIT b-tag efficiency
and the HOBIT mistag rate. We parameterize the resulting tag rate in the
MC samples as a 5-dimensional matrix, where each element is the measured
rate within a bin of the following five variables: jet $E_T$, jet $\eta$, the number
of tracks in the jet, the number of primary vertices in the event, and the
z location of the primary vertex from which the jet is calculated to have
originated. The matrix is similar to the SecVtx mistag matrix [17], although
of a lower dimension; the variables it has in common with the SecVtx mistag
matrix have the same binning between the two matrices. For eight different
HOBIT operating points, separate matrices are constructed for b, charm, and
light jets.
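
To make the idea of a binned tag-rate matrix concrete, the sketch below stores tagged and
total jet counts per bin of the five variables listed above and returns their ratio. The bin
edges, the dictionary keys, and the class name `TagRateMatrix` are hypothetical choices for
this illustration; the text does not give the actual binning or implementation.

```python
from bisect import bisect_right

# Hypothetical bin edges: the text names the five variables but not their binning.
BIN_EDGES = {
    "jet_et": [15.0, 30.0, 50.0, 80.0],   # GeV
    "abs_eta": [0.0, 1.0, 2.0],
    "n_tracks": [0, 4, 8, 16],
    "n_vertices": [1, 2, 4, 8],
    "vertex_z": [-60.0, 0.0, 60.0],       # cm
}
VARIABLES = tuple(BIN_EDGES)


def bin_index(edges, value):
    """Index of the bin containing value; out-of-range values go to the edge bins."""
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


class TagRateMatrix:
    """Tag rate per 5-d bin, for one jet flavor and one operating point."""

    def __init__(self):
        self.tagged = {}
        self.total = {}

    def _key(self, jet):
        return tuple(bin_index(BIN_EDGES[v], jet[v]) for v in VARIABLES)

    def fill(self, jet, is_tagged):
        key = self._key(jet)
        self.total[key] = self.total.get(key, 0) + 1
        self.tagged[key] = self.tagged.get(key, 0) + int(is_tagged)

    def rate(self, jet):
        key = self._key(jet)
        n = self.total.get(key, 0)
        return self.tagged.get(key, 0) / n if n else 0.0


# Toy usage with a made-up jet filled four times (two tagged, two untagged).
matrix = TagRateMatrix()
jet = {"jet_et": 42.0, "abs_eta": 0.3, "n_tracks": 5, "n_vertices": 1, "vertex_z": -12.0}
for is_tagged in (True, False, False, True):
    matrix.fill(jet, is_tagged)
```

In this picture, one such matrix would be built per jet flavor (b, charm, light) and per
operating point, as described in the text.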

The W+3-or-more-jets sample has an insufficient number of mistags to
calibrate the mistag SF, so we add a W+1 jet sample, which before b-tagging
requirements is almost pure W+light flavor (LF) events. After b tagging,
the W+1 jet sample consists of comparably sized $Wb\bar{b}$, $Wc\bar{c}$, $Wcj$, and
mistagged W+LF events. The background predictions [4] involve scaling
the total W+jets rate to data and subtracting off the non-W+jets components.
The prediction of the W+HF component of W+jets relies on the HF
K-factor. This scaling adjusts leading-order theoretical predictions of the
fraction of HF in W+jets events to account for higher-order corrections. We
find that the W+1 jet data provides an independent handle on the mistag
SF while the b-tag SF is constrained by the events with three or more jets.
However, the dependence on the HF K-factor introduces a systematic
uncertainty that strongly affects the mistag SF. For low values of the HOBIT
cut, the mistag rate is relatively high, and the relative contribution to the
tagged W+1 jet sample from W+HF events is lower. This translates to a
systematic uncertainty on the mistag SF due to the uncertainty on the HF
K-factor that is lower at low HOBIT output values than at high HOBIT
output values.

The maximum of the 2-d likelihood for the b-tag SF and the mistag SF is
calculated given the observed data and fixed values of the HF K-factor, the
$t\bar{t}$ cross section, and the minimum HOBIT output value. The dependences
on the HF K-factor and the $t\bar{t}$ cross section are then taken as sources of
systematic uncertainty. We assume $\sigma_{t\bar{t}} = 7.04 \pm 0.704$ pb [19], and take the
HF K-factor to be $1.4 \pm 0.4$.
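
A two-parameter maximum-likelihood fit of this kind can be sketched as follows, under strong
simplifying assumptions that are ours and not the analysis itself: each sample is reduced to
a single tagged-event count, the expected count is linear in the two SFs, and the maximum is
found with a brute-force grid scan. The sample yields are invented numbers; in this toy the
assumed $t\bar{t}$ cross section and HF K-factor would enter only through the fixed MC yields
`mc_b` and `other`.

```python
import math


def log_likelihood(sf_b, sf_mis, samples):
    """Poisson log-likelihood (constant terms dropped) summed over samples."""
    total = 0.0
    for s in samples:
        # Expected tagged yield: real-b tags and mistags scaled by their SFs.
        mu = sf_b * s["mc_b"] + sf_mis * s["mc_mistag"] + s["other"]
        total += s["observed"] * math.log(mu) - mu
    return total


def fit_scale_factors(samples, lo=0.50, hi=2.00, step=0.01):
    """Grid scan for the (b-tag SF, mistag SF) pair maximizing the likelihood."""
    n = int(round((hi - lo) / step)) + 1
    grid = [round(lo + i * step, 10) for i in range(n)]
    return max(
        ((b, m) for b in grid for m in grid),
        key=lambda p: log_likelihood(p[0], p[1], samples),
    )


# Invented yields: the W+1 jet sample is mistag-rich, the W+3-or-more-jets
# sample is b-rich. Observed counts were generated with SFs of 0.95 and 1.40.
toy_samples = [
    {"mc_b": 300.0, "mc_mistag": 500.0, "other": 50.0, "observed": 1035.0},
    {"mc_b": 900.0, "mc_mistag": 60.0, "other": 40.0, "observed": 979.0},
]
best_sf_b, best_sf_mis = fit_scale_factors(toy_samples)
```

Repeating the scan with the fixed inputs shifted up and down would give the corresponding
systematic variations in this simplified setting.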

The fitted b-tag and mistag SFs are shown in Figures 8 and 9, respectively,
as functions of the minimum HOBIT output value. The curves represent
a linear fit to the b-tag SF as a function of the minimum HOBIT output
value, and a parabolic fit to the mistag SF. The variation due to $\sigma_{t\bar{t}}$ is also
shown, where we take the larger of the two shifts in the result due to an
increase/decrease in $\sigma_{t\bar{t}}$ and then symmetrize the uncertainty.

The determination of the b-tag and mistag SFs is subject to the same
sources of systematic uncertainty as a measurement of $\sigma_{t\bar{t}}$ [20]. Specifically,
the $t\bar{t}$ acceptance depends on initial-state radiation and final-state radiation
(ISR+FSR), parton distribution functions (PDFs), jet energy scale, trigger
efficiencies and lepton identification efficiencies. The luminosity uncertainty,
although nearly absent in the results of Ref. [20], also contributes to the
overall systematic uncertainty.

For the loose (0.72) and tight (0.98) HOBIT operating points, this method
yields efficiency SFs of 0.997 ± 0.037 and 0.917 ± 0.069, respectively. The
mistag rate SFs are 1.391 ± 0.202 and 1.515 ± 0.291. A complete table
of systematic uncertainties for the efficiency SF is shown in Table 2 and
for the mistag matrix SF in Table 3. Figures 10, 11, 12, 13, and 14 show
validation plots comparing properties of the highest $E_T$ jet (HOBIT output,
and select HOBIT inputs) in $WH \to \ell\nu b\bar{b}$ candidate events before any b-tag
requirements or SF corrections are applied, for MC versus data. Good
agreement is seen between MC and data.

Figure 8: The measured value of the b-tag scale factor for the HOBIT tagger as a function
of the minimum HOBIT output value. Variations are shown assuming two values of
the $t\bar{t}$ cross section. The straight lines are fits to the SFs assuming the central value of
the $t\bar{t}$ cross section, and $\sigma_{t\bar{t}} = 6.336$ pb, the more conservative case for the purpose of
estimating uncertainties. The latter fit has been reflected through the central line to
obtain a symmetric uncertainty band.

Figure 9: The measured value of the mistag scale factor for the HOBIT tagger as a function
of the minimum HOBIT output value. Variations are shown assuming two values of the
$t\bar{t}$ cross section. Parabolas are fit to the results assuming the central value of the $t\bar{t}$ cross
section, and for $\sigma_{t\bar{t}} = 6.336$ pb. The latter has been reflected through the curve for the
central value to obtain the depicted uncertainty band.

Figure 10: Data versus MC, the HOBIT output distribution of the highest $E_T$ jet from
events in the $WH \to \ell\nu b\bar{b}$ sample before a requirement of a b-jet tag.

Figure 11: Data versus MC, highest track Bness of the highest $E_T$ jet from events in the
$WH \to \ell\nu b\bar{b}$ sample before a requirement of a b-jet tag.

Figure 12: Data versus MC, second highest track Bness of the highest $E_T$ jet from events
in the $WH \to \ell\nu b\bar{b}$ sample before a requirement of a b-jet tag.

Figure 13: Data versus MC, 3-d displacement significance of most HF-like displaced vertex
of the highest $E_T$ jet from events in the $WH \to \ell\nu b\bar{b}$ sample before a requirement of a
b-jet tag.

Figure 14: Data versus MC, pseudo-$c\tau$ of most HF-like displaced vertex of the highest $E_T$
jet from events in the $WH \to \ell\nu b\bar{b}$ sample before a requirement of a b-jet tag.

6.2. Scale factors using the electron conversion method

A second method of calculating the correction for the HOBIT MC response
involves a modification of the traditional SecVtx efficiency SF algorithm
in a way that does not require the concept of a "negative tag" [16].
However, like the SecVtx technique, this method takes advantage of the HF
enhancement among jets containing electrons, discriminating between HF
and LF jets based upon whether the electron is identified as coming from a
photon conversion.

The event sample consists of back-to-back dijet events where one jet contains
an electron candidate (the electron jet, or "e-jet"), while its opposite
jet has no such requirement (the away jet, or "a-jet"). We can label each jet
as originating either from an HF quark ("B") or a light flavor quark or gluon
("Q") and categorize each event as $N_{XY}$, where the e-jet has flavor X and
the a-jet has flavor Y. Then the total number of events ($N_e$) is

\[ N_e = N_{BB} + N_{BQ} + N_{QB} + N_{QQ} \]

and the HF fraction of the e-jets is

\[ F_B = (N_{BB} + N_{BQ}) / N_e . \]

Applying a b tag on the e-jet with a tagging efficiency ($\epsilon_e$) and a mistag rate
($\epsilon_{\mathrm{mis}}$), the number of b-tagged e-jets ($N_e^+$) is

\[ N_e^+ = \epsilon_e \cdot (N_{BB} + N_{BQ}) + \epsilon_{\mathrm{mis}} \cdot (N_{QB} + N_{QQ}) . \]

Assuming the fraction of light flavor jets with conversions is $f_c$ and the
conversion finding efficiency is $\epsilon_c$ for the light flavor jets and $\epsilon^0$ for the HF
jets, we can obtain the number of e-jets identified from the conversion, $N_{ec}$, as

\[ N_{ec} = \epsilon^0 \cdot (N_{BB} + N_{BQ}) + \epsilon_c \cdot f_c \cdot (N_{QB} + N_{QQ}) . \]

After tagging, the number of b-tagged conversion e-jets ($N_{ec}^+$) becomes

\[ N_{ec}^+ = k \cdot \epsilon_e \cdot \epsilon^0 \cdot (N_{BB} + N_{BQ}) + \epsilon_{\mathrm{mis}} \cdot \epsilon_c \cdot f_c \cdot (N_{QB} + N_{QQ}) , \]

where $k$ is the ratio of the b-tag efficiency for an HF e-jet identified as a
conversion to that for one that is not.

The previous two equations allow us to solve for $\epsilon_{\mathrm{mis}}$ and $\epsilon_e$:

\[ \epsilon_{\mathrm{mis}} = (N_{ec}^+ - k \cdot \epsilon^0 \cdot N_e^+) / (N_{ec} - \epsilon^0 \cdot N_e \cdot (k + (1 - k) \cdot F_B)) \]

and

\[ \epsilon_e = (N_e^+ - \epsilon_{\mathrm{mis}} \cdot N_e \cdot (1 - F_B)) / (N_e \cdot F_B) . \]

Here, all terms other than the mistag and efficiency rates can be counted
directly in data, taken from MC ($k$), measured in data ($F_B$), or either taken
from MC or measured in data ($\epsilon^0$). In the case of $F_B$, we can simply
use the traditional SecVtx electron method [16] to give us this value. For
$\epsilon^0$, obtaining this quantity from MC is trivial, as we have truth information
available. To calculate it from data, we look at the rate at which positively
SecVtx-tagged jets are found to contain conversion electrons and then adjust
this rate using negatively-SecVtx-tagged jets.
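
The closed-form solution for the mistag rate and the tagging efficiency can be checked
numerically with a short sketch. The function and argument names are chosen here for
illustration, and the yields in the usage example are invented: they are generated from the
forward equations with known rates so that the inversion can be verified.

```python
def solve_tag_rates(n_e, n_e_tag, n_ec, n_ec_tag, f_b, k, eps0):
    """Solve for (tag efficiency, mistag rate) in the electron conversion method.

    n_e      : number of e-jets (N_e)
    n_e_tag  : number of b-tagged e-jets (N_e^+)
    n_ec     : number of conversion e-jets (N_ec)
    n_ec_tag : number of b-tagged conversion e-jets (N_ec^+)
    f_b      : HF fraction of the e-jets (F_B)
    k        : tag-efficiency ratio, conversion over non-conversion HF e-jets
    eps0     : conversion finding efficiency for HF jets
    """
    denominator = n_ec - eps0 * n_e * (k + (1.0 - k) * f_b)
    if n_e <= 0 or f_b <= 0 or denominator == 0:
        raise ValueError("inputs do not determine the rates")
    eps_mis = (n_ec_tag - k * eps0 * n_e_tag) / denominator
    eps_e = (n_e_tag - eps_mis * n_e * (1.0 - f_b)) / (n_e * f_b)
    return eps_e, eps_mis


# Invented closure test: 4000 HF and 6000 LF e-jets, true eps_e = 0.5,
# eps_mis = 0.02, eps0 = 0.05, eps_c = 0.7, f_c = 0.4, k = 0.9.
n_hf, n_lf = 4000.0, 6000.0
n_e = n_hf + n_lf
n_e_tag = 0.5 * n_hf + 0.02 * n_lf
n_ec = 0.05 * n_hf + 0.7 * 0.4 * n_lf
n_ec_tag = 0.9 * 0.5 * 0.05 * n_hf + 0.02 * 0.7 * 0.4 * n_lf
eps_e, eps_mis = solve_tag_rates(n_e, n_e_tag, n_ec, n_ec_tag, n_hf / n_e, 0.9, 0.05)
```

Note that the light-flavor quantities $\epsilon_c$ and $f_c$ are needed only to generate the
toy yields; they drop out of the solution itself.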

The resulting tagging efficiency SFs for the loose and tight HOBIT operating
points are 0.986 ± 0.066 and 0.949 ± 0.044, respectively, in good agreement
with the results from the $t\bar{t}$ method. Some of the largest contributors to the
systematic component of these uncertainties include the difference between
the results when we use the MC-calculated $\epsilon^0$ versus the data-calculated
version, and the fact that b jets containing electrons tend to leave fewer tracks
than typical b jets.

The SFs on the mistag rate for the loose and tight HOBIT operating
points are 1.28 ± 0.17 and 1.42 ± 0.89, respectively, also consistent with the
results of the $t\bar{t}$ method. As a check, we compare e-jets in data and MC
(Figs. 15 and 16), after purifying the HF content by requiring the away jet
to be tight SecVtx tagged and the electron in the e-jet to not be identified
as a conversion. The fraction of HF versus light jet MC used in these plots
is determined via a fit of MC templates to the HOBIT distribution in data.

6.3. SF Combination 

When combining the correction SFs for the MC b-tag efficiency from the
electron and $t\bar{t}$ methods, we obtain 0.993 ± 0.032 (for HOBIT's loose operating
point, 0.72) and 0.937 ± 0.037 (HOBIT's tight operating point, 0.98). The
combined results for the mistag rates are 1.331 ± 0.130 and 1.492 ± 0.277,
respectively. Because the uncertainties in the electron and $t\bar{t}$ methods are
uncorrelated, the combination is straightforward. This results in a greater
than 25% reduction in the size of the uncertainty on the b-tag efficiency
in comparison to the previous most widely used CDF b-tagging algorithm,
SecVtx.
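
The text does not spell out the combination procedure. One standard choice for uncorrelated
measurements is an inverse-variance weighted average, sketched below as an assumption rather
than as the authors' method. Applied to the two loose-point efficiency SFs quoted above, it
gives an uncertainty of about 0.032 and a central value within 0.002 of the quoted 0.993.

```python
import math


def combine_uncorrelated(measurements):
    """Inverse-variance weighted average of (value, uncertainty) pairs."""
    weights = [1.0 / sigma ** 2 for _, sigma in measurements]
    total_weight = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, measurements)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)


# b-tag efficiency SFs quoted in the text: (ttbar method, electron method).
loose_eff = combine_uncorrelated([(0.997, 0.037), (0.986, 0.066)])
tight_eff = combine_uncorrelated([(0.917, 0.069), (0.949, 0.044)])
# Mistag SFs quoted in the text, in the same order.
loose_mistag = combine_uncorrelated([(1.391, 0.202), (1.28, 0.17)])
tight_mistag = combine_uncorrelated([(1.515, 0.291), (1.42, 0.89)])
```

The small differences in central values relative to the quoted combinations suggest that the
actual procedure may differ in detail from this simple average.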
Figure 15: HOBIT output for electron jets, data versus MC. Relative proportions of HF
to light jets are determined via a fit of the two MC templates to the data.

Figure 16: Comparison of select HOBIT inputs for electron jets, data versus MC.

7. Conclusion

We have developed an NN-based b identification tagger which improves
upon the best ideas of previous CDF taggers, has a very generous taggability
requirement, and has been optimized for searches in $H \to b\bar{b}$, the primary decay
channel of the light Higgs boson at the Tevatron. Using two uncorrelated
and innovative methods, we found tagging efficiencies, mistag rates, and
data-to-MC scale factors that are in good agreement. The combination of
these methods results in a greater than 25% reduction in the b-tag efficiency
uncertainty compared to SecVtx, the previous most widely used CDF
b-tagging algorithm. In the current light Higgs boson analyses at CDF, we
estimate that replacing previous tagging algorithms with HOBIT results in
a 10-20% improvement in Higgs boson sensitivity.
Jet (HOBIT) Input                          Importance
RomaVtx pseudo-$c\tau$                     435
RomaVtx 3-d displacement significance      382
Bness                                      77.5
Bness 1                                    21.5
SecVtx Loose                               16.9
Bness 3                                    9.90
Number of muons                            7.80
ptFrac                                     7.05
Bness 2                                    6.22
Bness 4                                    5.46
muon $p_T$ to jet axis                     5.32
Bness 5                                    4.54
Bness 9                                    4.40
$M_{inv}$ of HF-like tracks                4.17
Bness 6                                    3.44
Bness 8                                    2.70
RomaVtx 3-d displacement                   2.24
SecVtx Mass                                1.68
Bness 7                                    1.51
RomaVtx Mass                               0.752
Number of track-by-track NN tracks         0.380
Number of HF-like tracks                   0.287
Jet $E_T$                                  0.161
Number of Roma-selected tracks             0.125
Total $p_T$ of tracks                      0.00250

Table 1: Inputs to the HOBIT tagger and their importances; ranking is done by importance
(see text for definition of this term). "RomaVtx" denotes the most HF-like vertex as found
by the RomaNN tagger.
Table 2: The systematic uncertainties for the b-jet tagging efficiency scale factor from the
$\sigma(t\bar{t})$ method measurement. This uncertainty must be combined with the electron method
scale factor uncertainty; the two should be treated as uncorrelated. The uncertainties
shown below are absolute shifts.

b-eff SF, $\sigma(t\bar{t})$ method            HOBIT Operating Point
source                      shift    Loose     Tight
$\sigma(t\bar{t})$          up       -0.011    -0.019
                            down      0.011     0.019
luminosity                  up       -0.004    -0.055
                            down      0.007     0.012
jet energy scale            up       -0.005    -0.007
                            down      0.005     0.007
generator                   up        0.003     0.005
                            down     -0.003    -0.005
ISR/FSR                     up       -0.001    -0.001
                            down      0.001     0.001
$t \to Wb$ branching ratio  up       -0.001    -0.001
                            down      0.001     0.001
Trigger                     up       -0.001    -0.001
                            down      0.001     0.001
PDF                         up        0.001     0.001
                            down     -0.001    -0.001
W+j kfactor                 up        0.009     0.006
                            down     -0.009    -0.006
Statistics                  up        0.014     0.008
                            down     -0.014    -0.008
total                       up        0.022     0.026
                            down     -0.022    -0.026
Table 3: The systematic uncertainties for the mistag rate scale factor from the $\sigma(t\bar{t})$
method measurement. This uncertainty must be combined with the electron method scale
factor uncertainty; the two should be treated as uncorrelated. The uncertainties shown
below are absolute shifts.

mistag SF, $\sigma(t\bar{t})$ method           HOBIT Operating Point
source                      shift    Loose     Tight
$\sigma(t\bar{t})$          up        0.007     0.090
                            down     -0.007    -0.090
luminosity                  up        0.004     0.055
                            down     -0.004    -0.055
jet energy scale            up        0.003     0.037
                            down     -0.003    -0.037
generator                   up        0.002     0.023
                            down     -0.002    -0.023
ISR/FSR                     up        0.000     0.005
                            down     -0.000    -0.005
$t \to Wb$ branching ratio  up        0.000     0.005
                            down     -0.000    -0.005
Trigger                     up        0.000     0.005
                            down     -0.000    -0.005
PDF                         up        0.000     0.005
                            down     -0.000    -0.005
W+j kfactor                 up       -0.091    -0.135
                            down      0.055     0.081
Statistics                  up        0.024     0.125
                            down     -0.024    -0.125
total                       up        0.094     0.217
                            down     -0.060    -0.180
Table 4: The systematic uncertainties for the b-jet tagging efficiency scale factor from the
electron method measurement. This uncertainty must be combined with the $\sigma(t\bar{t})$ method
scale factor uncertainty; the two should be treated as uncorrelated. The uncertainties
shown below are absolute shifts.

b-eff SF, electron method                  HOBIT Operating Point
source                      shift    Loose     Tight
over eff.                   up        0.009     0.014
                            down     -0.009    -0.014
prescale corr.              up        0.001     0.011
                            down     -0.001    -0.011
$E_T$ depend.               up        0.010     0.003
                            down     -0.010    -0.003
semi-lep bias               up        0.010     0.006
                            down     -0.010    -0.006
charm model                 up        0.001     0.002
                            down     -0.001    -0.002
Stats                       up        0.016     0.018
                            down     -0.016    -0.018
total                       up        0.023     0.026
                            down     -0.023    -0.026
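
The totals in Table 4 appear consistent with adding the individual sources in quadrature. The
text does not state this explicitly, so the following sketch is only a cross-check under that
assumption; the values are listed in the order in which the sources appear in the table.

```python
import math


def quadrature_sum(shifts):
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(shift ** 2 for shift in shifts))


# Magnitudes of the six sources in Table 4, top to bottom (last entry: statistics).
loose_sources = [0.009, 0.001, 0.010, 0.010, 0.001, 0.016]
tight_sources = [0.014, 0.011, 0.003, 0.006, 0.002, 0.018]

loose_total = quadrature_sum(loose_sources)
tight_total = quadrature_sum(tight_sources)
```

Rounded to three decimal places, the two sums reproduce the totals listed in the table.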
Table 5: The systematic uncertainties for the mistag rate scale factor from the electron
method measurement. This uncertainty must be combined with the $\sigma(t\bar{t})$ method scale
factor uncertainty; the two should be treated as uncorrelated. The uncertainties shown
below are absolute shifts.

mistag SF, electron method                 HOBIT Operating Point
source                      shift    Loose     Tight
over eff.                   up        0.024     0.092
                            down     -0.024    -0.092
prescale corr.              up        0.010     0.003
                            down     -0.010    -0.003
$E_T$ depend.               up        0.014     0.018
                            down     -0.014    -0.018
semi-lep bias               up        0.040     0.055
                            down     -0.040    -0.055
charm model                 up        0.001     0.004
                            down     -0.001    -0.004
Stats                       up        0.078     0.163
                            down     -0.078    -0.163
total                       up        0.092     0.196
                            down     -0.092    -0.196





